follower type
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (14 more...)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > New York > Richmond County > New York City (0.04)
- North America > United States > New York > Queens County > New York City (0.04)
- (16 more...)
- Transportation (1.00)
- Information Technology > Security & Privacy (1.00)
- Leisure & Entertainment (0.93)
- (2 more...)
Learning in Structured Stackelberg Games
Balcan, Maria-Florina, Fragkia, Kiriaki, Harris, Keegan
We study structured Stackelberg games, in which both players (the leader and the follower) observe contextual information about the state of the world at time of play. The leader plays against one of a finite number of followers, but the follower's type is not known until after the game has ended. Importantly, we assume a fixed relationship between the contextual information and the follower's type, thereby allowing the leader to leverage this additional structure when deciding her strategy. Under this setting, we find that standard learning theoretic measures of complexity do not characterize the difficulty of the leader's learning task. Instead, we introduce a new notion of dimension, the Stackelberg-Littlestone dimension, which we show characterizes the instance-optimal regret of the leader in the online setting. Based on this, we also provide a provably optimal learning algorithm. We extend our results to the distributional setting, where we use two new notions of dimension, the $γ$-Stackelberg-Natarajan dimension and $γ$-Stackelberg-Graph dimension. We prove that these control the sample complexity lower and upper bounds respectively, and we design a simple, improper algorithm that achieves the upper bound.
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Education > Educational Setting > Online (0.68)
- Leisure & Entertainment > Games (0.67)
- Transportation > Infrastructure & Services > Airport (0.45)
- Transportation > Air (0.45)
Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information
Balcan, Maria-Florina, Bernasconi, Martino, Castiglioni, Matteo, Celli, Andrea, Harris, Keegan, Wu, Zhiwei Steven
We study the problem of online learning in Stackelberg games with side information between a leader and a sequence of followers. In every round the leader observes contextual information and commits to a mixed strategy, after which the follower best-responds. We provide learning algorithms for the leader which achieve $O(T^{1/2})$ regret under bandit feedback, an improvement from the previously best-known rates of $O(T^{2/3})$. Our algorithms rely on a reduction to linear contextual bandits in the utility space: In each round, a linear contextual bandit algorithm recommends a utility vector, which our algorithm inverts to determine the leader's mixed strategy. We extend our algorithms to the setting in which the leader's utility function is unknown, and also apply it to the problems of bidding in second-price auctions with side information and online Bayesian persuasion with public and private states. Finally, we observe that our algorithms empirically outperform previous results on numerical simulations.
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Middle East > Cyprus > Pafos > Paphos (0.04)
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.35)
Regret Minimization in Stackelberg Games with Side Information
Harris, Keegan, Wu, Zhiwei Steven, Balcan, Maria-Florina
Algorithms for playing in Stackelberg games have been deployed in real-world domains including airport security, anti-poaching efforts, and cyber-crime prevention. However, these algorithms often fail to take into consideration the additional information available to each player (e.g. traffic patterns, weather conditions, network congestion), a salient feature of reality which may significantly affect both players' optimal strategies. We formalize such settings as Stackelberg games with side information, in which both players observe an external context before playing. The leader commits to a (context-dependent) strategy, and the follower best-responds to both the leader's strategy and the context. We focus on the online setting in which a sequence of followers arrive over time, and the context may change from round-to-round. In sharp contrast to the non-contextual version, we show that it is impossible for the leader to achieve good performance (measured by regret) in the full adversarial setting. Motivated by our impossibility result, we show that no-regret learning is possible in two natural relaxations: the setting in which the sequence of followers is chosen stochastically and the sequence of contexts is adversarial, and the setting in which the sequence of contexts is stochastic and the sequence of followers is chosen by an adversary.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Indiana (0.04)
- Transportation (1.00)
- Information Technology > Security & Privacy (1.00)
- Leisure & Entertainment (0.93)
- (2 more...)
Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense
Sengupta, Sailik, Kambhampati, Subbarao
The field of cybersecurity has mostly been a cat-and-mouse game with the discovery of new attacks leading the way. To take away an attacker's advantage of reconnaissance, researchers have proposed proactive defense methods such as Moving Target Defense (MTD). To find good movement strategies, researchers have modeled MTD as leader-follower games between the defender and a cyber-adversary. We argue that existing models are inadequate in sequential settings when there is incomplete information about a rational adversary and yield sub-optimal movement strategies. Further, while there exists an array of work on learning defense policies in sequential settings for cyber-security, they are either unpopular due to scalability issues arising out of incomplete information or tend to ignore the strategic nature of the adversary simplifying the scenario to use single-agent reinforcement learning techniques. To address these concerns, we propose (1) a unifying game-theoretic model, called the Bayesian Stackelberg Markov Games (BSMGs), that can model uncertainty over attacker types and the nuances of an MTD system and (2) a Bayesian Strong Stackelberg Q-learning (BSS-Q) approach that can, via interaction, learn the optimal movement policy for BSMGs within a reasonable time. We situate BSMGs in the landscape of incomplete-information Markov games and characterize the notion of Strong Stackelberg Equilibrium (SSE) in them. We show that our learning approach converges to an SSE of a BSMG and then highlight that the learned movement policy (1) improves the state-of-the-art in MTD for web-application security and (2) converges to an optimal policy in MTD domains with incomplete information about adversaries even when prior information about rewards and transitions is absent.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Arizona (0.04)
- Europe > France (0.04)
Using Game Theory for Los Angeles Airport Security
Pita, James (University of Southern California) | Jain, Manish (University of Southern California) | Ordóñez, Fernando (University of Southern California) | Portway, Christopher (University of Southern California) | Tambe, Milind (University of Southern California) | Western, Craig (University of Southern California) | Paruchuri, Praveen (Intelligent Automation, Inc.) | Kraus, Sarit (Bar Ilan University and University of Maryland)
Security at major locations of economic or political importance is a key concern around the world, particularly given the threat of terrorism. Limited security resources prevent full security coverage at all times, which allows adversaries to observe and exploit patterns in selective patrolling or monitoring, e.g. they can plan an attack avoiding existing patrols. Hence, randomized patrolling or monitoring is important, but randomization must provide distinct weights to different actions based on their complex costs and benefits. To this end, this paper describes a promising transition of the latest in multi-agent algorithms into a deployed application. In particular, it describes a software assistant agent called ARMOR (Assistant for Randomized Monitoring over Routes) that casts this patrolling/monitoring problem as a Bayesian Stackelberg game, allowing the agent to appropriately weigh the different actions in randomization, as well as uncertainty over adversary types. ARMOR combines two key features: (i) It uses the fastest known solver for Bayesian Stackelberg games called DOBSS, where the dominant mixed strategies enable randomization; (ii) Its mixed-initiative based interface allows users to occasionally adjust or override the automated schedule based on their local constraints. ARMOR has been successfully deployed since August 2007 at the Los Angeles International Airport (LAX) to randomize checkpoints on the roadways entering the airport and canine patrol routes within the airport terminals. This paper examines the information, design choices, challenges, and evaluation that went into designing ARMOR.
- North America > United States > California > Los Angeles County > Los Angeles (0.60)
- Oceania > Australia (0.04)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- (6 more...)
- Transportation > Infrastructure & Services > Airport (1.00)
- Transportation > Air (1.00)
- Leisure & Entertainment > Games (1.00)